Pointer Sentinel Mixture Models

Authors

  • Stephen Merity
  • Caiming Xiong
  • James Bradbury
  • Richard Socher
Abstract

Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies. Even then, they struggle to predict rare or unseen words, even if the context makes the prediction unambiguous. We introduce the pointer sentinel mixture architecture for neural sequence models, which has the ability to either reproduce a word from the recent context or produce a word from a standard softmax classifier. Our pointer sentinel-LSTM model achieves state-of-the-art language modeling performance on the Penn Treebank (70.9 perplexity) while using far fewer parameters than a standard softmax LSTM. In order to evaluate how well language models can exploit longer contexts and deal with more realistic vocabularies and larger corpora, we also introduce the freely available WikiText corpus.
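
The choice between reproducing a context word and falling back on the softmax classifier can be illustrated with a small sketch. The abstract does not state the formulas, so everything here is an assumption made for illustration: the gating scheme (a sentinel score competes with the pointer scores in a single softmax, and the sentinel's share of the mass becomes the weight on the vocabulary distribution), the helper names `softmax` and `pointer_sentinel_mix`, and the toy inputs. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sentinel_mix(vocab_logits, context_ids, pointer_scores, sentinel_score):
    """Mix a vocabulary softmax with a pointer over recent context words.

    Assumed scheme (hypothetical, for illustration only):
      - pointer scores and one sentinel score share a single softmax;
      - the sentinel's probability is the gate on the vocabulary softmax;
      - the remaining mass is added to the words at the pointed positions.
    """
    p_vocab = softmax(vocab_logits)
    joint = softmax(list(pointer_scores) + [sentinel_score])
    gate = joint[-1]  # weight given to the standard softmax classifier

    mixed = [gate * p for p in p_vocab]
    # A word that occurs several times in the context accumulates the mass
    # of every position where it appears.
    for word_id, mass in zip(context_ids, joint[:-1]):
        mixed[word_id] += mass
    return mixed, gate


# Toy example: a 4-word vocabulary and a 3-word context window.
vocab_logits = [0.1, 0.2, -1.0, 0.0]   # word 2 is unlikely under the softmax
context_ids = [2, 0, 2]                # word 2 appears twice in the context
pointer_scores = [2.0, 0.5, 1.5]
sentinel_score = 1.0

probs, gate = pointer_sentinel_mix(
    vocab_logits, context_ids, pointer_scores, sentinel_score
)
```

In this toy setup, word 2 receives little probability from the vocabulary softmax alone but gains mass from the two context positions that point to it, which is the behavior the abstract describes for rare words that the context makes predictable.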

Similar resources

Bayesian estimation of dynamic finite mixtures

The paper introduces an algorithm for estimation of dynamic mixture models. A new feature of the proposed algorithm is the ability to consider a dynamic form not only for component models but also for the pointer model, which describes the activities of the mixture components in time. The pointer model is represented by a table of transition probabilities that stochastically control the switchi...

Code Completion with Neural Attention and Pointer Networks

Intelligent code completion has become an essential tool to accelerate modern software development. To facilitate effective code completion for dynamically-typed programming languages, we apply neural language models by learning from large codebases, and investigate the effectiveness of attention mechanism on the code completion task. However, standard neural language models even with attention...

Sentinel Node Biopsy for the Head and Neck Using Contrast-Enhanced Ultrasonography Combined with Indocyanine Green Fluorescence in Animal Models: A Feasibility Study

BACKGROUND Sentinel node navigation surgery is gaining popularity in oral cancer. We assessed application of sentinel lymph node navigation surgery to pharyngeal and laryngeal cancers by evaluating the combination of contrast-enhanced ultrasonography and indocyanine green fluorescence in animal models. METHODS This was a prospective, nonrandomized, experimental study in rabbit and swine anima...

Volumetric soil moisture estimation using Sentinel 1 and 2 satellite images

Surface soil moisture is an important variable that plays a crucial role in the management of water and soil resources. Estimating this parameter is one of the important applications of remote sensing. One of the remote sensing techniques for precise estimation of this parameter is data-driven models. In this study, volumetric soil moisture content was estimated using data-driven models, suppor...

Logfile Failure Prediction using Recurrent and Quasi Recurrent Neural Networks

Innumerable man-hours are spent waiting for running programs to finish and then debugging them when they crash by analyzing the logfile. Oftentimes logfiles will contain a plethora of text extraneous to debugging, providing for a tedious undertaking. Therefore, we present an application of state-of-the-art natural language processing with neural networks to classify a run’s final outcome prior ...

Journal:
  • CoRR

Volume abs/1609.07843

Publication date: 2016